Efficient client-server based implementations of mobile speech recognition services

Authors

  • Richard C. Rose
  • Iker Arizmendi
Abstract

The purpose of this paper is to demonstrate the efficiencies that can be achieved when automatic speech recognition (ASR) applications are provided to large user populations using client-server implementations of interactive voice services. It is shown that, through proper design of a client-server framework, excellent overall system performance can be obtained with minimal demands on the computing resources that are allocated to ASR. System performance is considered in the paper in terms of both ASR speed and accuracy in multi-user scenarios. An ASR resource allocation strategy is presented that maintains sub-second average speech recognition response latencies observed by users even as the number of concurrent users exceeds the available number of ASR servers by more than an order of magnitude. An architecture for unsupervised estimation of user-specific feature space adaptation and normalization algorithms is also described and evaluated. Significant reductions in ASR word error rate were obtained by applying these techniques to utterances collected from users of hand-held mobile devices. These results are important because, while there is a large body of work addressing the speed and accuracy of individual ASR decoders, there has been very little effort applied to dealing with the same issues when a large number of ASR decoders are used in multi-user scenarios.

Preprint submitted to Elsevier Science, 5 May 2006


Similar resources

Automatic Speech Recognition on Mobile Communication Networks

As mobile devices become pervasive and small, the design of efficient user interfaces is rapidly developing into a major issue. The expectation for speech-centric interfaces has stimulated a great interest in deploying automatic speech recognition (ASR) on devices like mobile phones, PDAs and automobiles. Mobile devices are characterised as having limited computational power, memory size and ba...


Development of client-server speech translation system on a multi-lingual speech communication platform

This paper describes a client-server speech-to-speech translation system developed on a multi-lingual speech communication platform. This platform enables easy assembly of speech communication system from the corresponding software modules (e.g. speech recognition, spoken language machine-translation, speech synthesis). This client-server speech translation system is designed for use at mobile ...


Robust speech recognition in client-server scenarios

This paper addresses issues that are specific to the implementation of automatic speech recognition (ASR) applications and services in client-server scenarios. It is assumed in all of these scenarios that functionality in a human-machine dialog system is distributed between mobile client devices and network based multi-user media and application servers. It is argued that, while there has alrea...


Acoustic Model and Language Model Adaptation for a Mobile Dictation Service

Automatic speech recognition is the machine-based method of converting speech to text. MobiDic is a mobile dictation service which uses a server-side speech recognition system to convert speech recorded on a mobile phone to readable and editable text notes. In this work, performance of the TKK speech recognition system has been evaluated on law-related speech recorded on a mobile phone with the...


Internet Chinese information retrieval using unconstrained Mandarin speech queries based on a client-server architecture and a PAT-tree-based language model

In order to pursue high performance of Chinese information access on the Internet, this paper presents an attractive approach with a successful integration of efficient speech recognition and information retrieval techniques. A working system based on the proposed approach for speech retrieval of real-time Chinese netnews services has been implemented and tested. Very exciting performance has b...


Journal:
  • Speech Communication

Volume 48  Issue 

Pages  -

Publication date: 2006